Compiler Correctness
   HOME

TheInfoList



OR:

In
computing Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and development of both hardware and software. Computing has scientific, e ...
, compiler correctness is the branch of computer science that deals with trying to show that a
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
behaves according to its
language specification In computer programming, a programming language specification (or standard or definition) is a documentation artifact that defines a programming language so that users and implementors can agree on what programs in that language mean. Specifica ...
. Techniques include developing the compiler using
formal methods In computer science, formal methods are mathematically rigorous techniques for the specification, development, and verification of software and hardware systems. The use of formal methods for software and hardware design is motivated by the expec ...
and using rigorous testing (often called compiler validation) on an existing compiler.


Formal verification

Two main
formal verification In the context of hardware and software systems, formal verification is the act of proving or disproving the correctness of intended algorithms underlying a system with respect to a certain formal specification or property, using formal metho ...
approaches for establishing correctness of compilation are proving correctness of the compiler for all inputs and proving correctness of a compilation of a particular program (translation validation).


Compiler correctness for all input programs

Compiler validation with formal methods involves a long chain of formal,
deductive logic Deductive reasoning is the mental process of drawing deductive inferences. An inference is deductively valid if its conclusion follows logically from its premises, i.e. if it is impossible for the premises to be true and the conclusion to be false ...
. However, since the tool to find the proof ( theorem prover) is implemented in software and is complex, there is a high probability it will contain errors. One approach has been to use a tool that verifies the proof (a
proof checker In computer science and mathematical logic, a proof assistant or interactive theorem prover is a software tool to assist with the development of formal proofs by human-machine collaboration. This involves some sort of interactive proof editor ...
) which, because it is much simpler than a proof-finder, is less likely to contain errors. A prominent example of this approach is
CompCert CompCert is a formally verified optimizing compiler for a large subset of the C99 programming language (known as Clight) which currently targets PowerPC, ARM, RISC-V, x86 and x86-64 architectures. This project, led by Xavier Leroy, started o ...
, which is a formally verified optimizing compiler of a large subset of
C99 C99 (previously known as C9X) is an informal name for ISO/IEC 9899:1999, a past version of the C programming language standard. It extends the previous version ( C90) with new features for the language and the standard library, and helps impl ...
. Another verified compiler was developed in CakeML project, which establishes correctness of a substantial subset of
Standard ML Standard ML (SML) is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular among compiler writers and programming language researchers, as well as in the development of the ...
programming language using the
HOL (proof assistant) HOL (Higher Order Logic) denotes a family of Proof assistant, interactive theorem proving systems using similar Higher-order logic, (higher-order) logics and implementation strategies. Systems in this family follow the LCF (theorem prover), LCF a ...
. Another approach to obtain a formally correct compiler is to use semantics-directed compiler generation.


Translation validation: compiler correctness on a given program

In contrast to attempting to prove that a compiler is correct for all valid input programs translation validation aims to automatically establish that a given input program is compiled correctly. Proving correct compilation of a given program is potentially easier than proving a compiler correct for all programs, but still requires symbolic reasoning, because a fixed program may still work on arbitrarily large inputs and run for arbitrarily long amount of time. Translation validation can reuse an existing compiler implementation by generating, for a given compilation, a proof that the compilation was correct. Translation validation can be used even with a compiler that sometimes generates incorrect code, as long as this incorrect does not manifest itself for a given program. Depending on the input program the translation validation can fail (because the generated code is wrong or the translation validation technique is too weak to show correctness). However, if translation validation succeeds, then the compiled program is guaranteed to be correct for all inputs.


Testing

Testing represents a significant portion of the effort in shipping a compiler, but receives comparatively little coverage in the standard literature. The 1986 edition of Aho, Sethi, & Ullman has a single-page section on compiler testing, with no named examples. The 2006 edition omits the section on testing, but does emphasize its importance: “Optimizing compilers are so difficult to get right that we dare say that no optimizing compiler is completely error-free! Thus, the most important objective in writing a compiler is that it is correct.” Fraser & Hanson 1995 has a brief section on
regression testing Regression testing (rarely, ''non-regression testing'') is re-running functional and non-functional tests to ensure that previously developed and tested software still performs as expected after a change. If not, that would be called a '' regre ...
; source code is available. Bailey & Davidson 2003 cover testing of procedure calls A number of articles confirm that many released compilers have significant code-correctness bugs. Sheridan 2007 is probably the most recent journal article on general compiler testing. For most purposes, the largest body of information available on compiler testing are the Fortran and Cobol validation suites. Further common techniques when testing compilers are
fuzzing In programming and software development, fuzzing or fuzz testing is an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program. The program is then monitored for exceptions ...
(which generates random programs to try to find bugs in a compiler) and test case reduction (which tries to minimize a found test case to make it easier to understand).


See also

*
Compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
*
Verification and validation (software) Verify or verification may refer to: General * Verification and validation, in engineering or quality management systems, is the act of reviewing, inspecting or testing, in order to establish and document that a product, service or system meets ...
*
Correctness (computer science) In theoretical computer science, an algorithm is correct with respect to a specification if it behaves as specified. Best explored is ''functional'' correctness, which refers to the input-output behavior of the algorithm (i.e., for each input it p ...
* CompCert C compiler—Formally verified C compiler * Reflections on Trusting Trust


References

{{reflist Compiler construction Compiler theory